I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-Shaped Scene Text Detection
Authors
Abstract
Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., (1) fracture detections at the gaps in a text instance; and (2) inaccurate detections of arbitrary-shaped text instances with diverse background context. To address these issues, we propose a novel method named Intra- and Inter-Instance Collaborative Learning (I3CL). Specifically, to address the first issue, we design an effective convolutional module with multiple receptive fields, which is able to collaboratively learn better character and gap feature representations at local and long ranges inside a text instance. To address the second issue, we devise an instance-based transformer module to exploit the dependencies between different text instances and a global context module to exploit the semantic context from the shared background, which are able to collaboratively learn a more discriminative text feature representation. In this way, I3CL can effectively exploit the intra- and inter-instance dependencies together in a unified end-to-end trainable framework. Besides, to make full use of the unlabeled data, we design an effective semi-supervised learning method to leverage the pseudo labels via an ensemble strategy. Without bells and whistles, experimental results show that the proposed I3CL sets new state-of-the-art results on three challenging public benchmarks, i.e., an F-measure of 77.5% on ArT, 86.9% on Total-Text, and 86.4% on CTW-1500. Notably, our I3CL with the ResNeSt-101 backbone ranked 1st place on the ArT leaderboard. Code is available at www.github.com/ViTAE-Transformer/ViTAE-Transformer-Scene-Text-Detection .
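To illustrate the idea of combining multiple receptive fields, the following is a minimal one-dimensional sketch, not the authors' module. It assumes a toy response profile along a single text line and fuses three parallel dilated averaging branches by taking their mean. The function names, the kernel, the dilation rates, and the fusion rule are all hypothetical choices made for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1D convolution with zero padding and the given dilation."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # sampled position for this tap
            if 0 <= j < len(signal):       # positions outside count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def multi_receptive_field(signal, kernel=(1 / 3, 1 / 3, 1 / 3), dilations=(1, 2, 3)):
    """Average parallel branches that share a kernel but differ in dilation.

    Hypothetical fusion rule: a plain mean over branches.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


# Toy response along a text line: two character groups separated by a gap.
line = [1, 1, 1, 0, 0, 1, 1, 1]
local_only = dilated_conv1d(line, (1 / 3, 1 / 3, 1 / 3), 1)
fused = multi_receptive_field(line)
```

In this toy case the gap positions (indices 3 and 4) score about 0.56 in the fused output versus about 0.33 with the dilation-1 branch alone, because the wider branches see the characters on both sides of the gap. A learned two-dimensional module would use trained weights rather than fixed averages.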
Similar resources
A robust arbitrary text detection system for natural scene images
Text detection in real-world images captured in unconstrained environments is an important yet challenging computer vision problem due to a great variety of appearances, cluttered backgrounds, and character orientations. In this paper, we present a robust system based on the concepts of Mutual Direction Symmetry (MDS), Mutual Magnitude Symmetry (MMS) and Gradient Vector Symmetry (GVS) propert...
ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene
Arbitrary-oriented text detection in the wild is a very challenging task, due to the aspect ratio, scale, orientation, and illumination variations. In this paper, we propose a novel method, namely Arbitrary-oriented Text (or ArbText for short) detector, for efficient text detection in unconstrained natural scene images. Specifically, we first adopt the circle anchors rather than the rectangular...
Arbitrary-Oriented Scene Text Detection via Rotation Proposals
This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images. We present the Rotation Region Proposal Networks (RRPN), which is designed to generate inclined proposals with text orientation angle information. The angle information is then adapted for bounding box regression to make the proposals more accurately fit into the text region in ...
PixelLink: Detecting Scene Text via Instance Segmentation
Most state-of-the-art scene text detection algorithms are deep learning based methods that depend on bounding box regression and perform at least two kinds of predictions: text/non-text classification and location regression. Regression plays a key role in the acquisition of bounding boxes in these methods, but it is not indispensable because text/non-text prediction can also be considered as a ...
Multiple-Instance Learning for Natural Scene Classification
Multiple-Instance learning is a way of modeling ambiguity in supervised learning examples. Each example is a bag of instances, but only the bag is labeled, not the individual instances. A bag is labeled negative if all the instances are negative, and positive if at least one of the instances is positive. We apply the Multiple-Instance learning framework to the problem of learning how to classif...
Journal
Journal title: International Journal of Computer Vision
Year: 2022
ISSN: 0920-5691, 1573-1405
DOI: https://doi.org/10.1007/s11263-022-01616-6